Statistical Parsing with Context-Free Filtering Grammar
Authors
Michael Demko
Master of Science, Graduate Department of Computer Science, University of Toronto, 2007
Abstract
Statistical parsers that simultaneously generate both phrase-structure and lexical dependency trees have been limited in two important ways: the detection of non-projective dependencies has not been integrated with other parsing decisions, or the constraints between phrase-structure and dependency structure have been overly strict. I develop context-free filtering grammar as a generalization of the more restrictive lexicalized factored parsing model, and I develop for the new grammar formalism a scoring model to resolve parsing ambiguities. I demonstrate the flexibility of the new model by implementing a statistical parser for German, a freer-word-order language exhibiting a mixture of context-free and non-projective behaviours.
Similar resources
Dependency Parsing Resources for French: Converting Acquired Lexical Functional Grammar F-Structure Annotations and Parsing F-Structures Directly
Recent years have seen considerable success in the generation of automatically obtained wide-coverage deep grammars for natural language processing, given reliable and large CFG-like treebanks. For research within the Lexical Functional Grammar framework, these deep grammars are typically based on an extended PCFG parsing scheme from which dependencies are extracted. However, increasing success in ...
Statistical Parsing with a Grammar Acquired from a Bracketed Corpus Based on Clustering Analysis
This paper proposes a new method for learning a context-sensitive conditional probability context-free grammar from an unlabeled bracketed corpus based on clustering analysis and describes a natural language parsing model which uses a probability-based scoring function of the grammar to rank parses of a sentence. By grouping brackets in a corpus into a number of similar bracket groups based on ...
Acquiring a Stochastic Context-Free Grammar from the Penn Treebank
In this paper we present preliminary results of investigating the structure of the Penn Treebank and how these results can be used in probabilistic parsing of English. Penn Treebank is a corpus of 4.9 million part-of-speech (POS) tagged words and 2.9 million words of skeletally parsed data developed by the University of Pennsylvania (see [8]). By matching skeletal parse files with POS-tagged files w...
Grammar Acquisition and Statistical Parsing by exploiting Local Contextual Information
This paper presents a method for inducing a context-sensitive conditional probability context-free grammar from an unlabeled bracketed corpus using local contextual information and describes a natural language parsing model which uses a probability-based scoring function of the grammar to rank parses of a sentence. This method uses clustering techniques to group brackets in a corpus into a numbe...
A Probabilistic Context-Free Grammar for Melodic Reduction
This article presents a method used to find tree structures in musical scores using a probabilistic grammar for melodic reduction. A parsing algorithm is used to find the optimal parse of a piece with respect to the grammar. The method is applied to parse phrases from Bach chorale melodies. The statistical model of music defined by the grammar is also used to evaluate the entropy of the studied...